Remove QueryID to fix #549 #761

benjimin · 2023-02-28T03:12:39Z

If users follow the RDS documentation included in this repo, they will use this SQL yaml and encounter a cardinality explosion, because 19 separate metrics are stored for each new query that gets performed. Removed queryid to fix #549.

Signed-off-by: Ben Lewis <[email protected]>

sysadmind · 2023-03-05T20:21:58Z

@SuperQ What are your thoughts on this?

SuperQ · 2023-03-05T22:20:07Z

The query id is the primary reason that this collector exists, so the collector is not really useful without it.

benjimin · 2023-03-05T23:21:57Z

@SuperQ could you share a bit of explanation for why you want the query id in metrics (as opposed to logs)?

(What would be the benefit of exporting a separate time series for each query id, given that the query id generally will never reoccur and thus the values in the time series will never fluctuate?)

This kind of flies directly in the face of Prometheus design and best practice. See, for example, the admonition from the Prometheus documentation:

CAUTION:
Remember that every unique combination of key-value label pairs represents a new time series, which can dramatically increase the amount of data stored. Do not use labels to store dimensions with high cardinality (many different label values), such as user IDs, email addresses, or other unbounded sets of values.
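As a rough illustration of the arithmetic behind this caution, here is a minimal Python sketch. It assumes the figure of 19 metrics per query quoted earlier in this thread and ignores every label other than the query id and the database instance; the function name and the sample counts are hypothetical.

```python
# Figure quoted earlier in this thread: 19 metrics are stored per query id.
METRICS_PER_QUERY = 19


def series_count(distinct_query_ids: int, instances: int = 1,
                 metrics_per_query: int = METRICS_PER_QUERY) -> int:
    """Estimate time series, treating each (metric, instance, query id)
    combination as one series and ignoring all other labels."""
    return metrics_per_query * distinct_query_ids * instances


if __name__ == "__main__":
    # Hypothetical numbers of distinct query ids seen on one instance.
    for n in (50, 5_000, 500_000):
        print(f"{n:>7} query ids -> {series_count(n):>10} series")
```

Under these assumptions the series count grows linearly with the number of distinct query ids, so whether the label is tolerable depends on whether that number stays bounded.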

At minimum, do you think it would be prudent to update README-RDS.md to warn the user about the consequences of using the queries.yaml in the manner currently advised (particularly since there are also significant financial risks noted in the #549 issue discussion)?

SuperQ · 2023-03-06T08:28:47Z

Because in many applications, the number of query ID patterns is fairly limited, and the number of database instances is also much more constrained compared to the number of application servers. This makes the total cardinality acceptable for a lot of use cases.

Knowing about the performance characteristics of the top X queries can be extremely useful, which is why it's in the example file.
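One way to picture the "top X queries" idea is the Python sketch below. It is not the exporter's implementation: the `QueryStat` type, its field names, and the `top_queries` helper are hypothetical, and it assumes per-query statistics have already been fetched into memory.

```python
from typing import NamedTuple


class QueryStat(NamedTuple):
    queryid: str          # hypothetical query identifier
    total_seconds: float  # hypothetical cumulative execution time


def top_queries(stats: list[QueryStat], limit: int) -> list[QueryStat]:
    """Keep only the `limit` most expensive queries, so that a single
    scrape exposes at most `limit` distinct query id label values."""
    return sorted(stats, key=lambda s: s.total_seconds, reverse=True)[:limit]
```

A cap like this bounds the number of series per scrape, but the set of top queries can change over time, so the cumulative number of series may still grow.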

We're planning to remove support for the queries.yaml by moving the various queries into Go code. The whole feature is an anti-pattern that was added to this exporter before it was adopted by the community. As you noticed, people blindly copy the file into production.

I don't think this has anything to do with whether RDS is used. Any application can produce a lot of series.

What would be OK with me would be to simply comment out the whole query in the yaml file. This way it's not on by default for users that are prone to copy-pasta and don't read through the file to see what it does. I also made some suggested changes to the query but never bothered to update it. That's one of the reasons why we want to eliminate this example file. Good improvements get made locally and never make it upstream.

Remove QueryID to fix prometheus-community#549

ffff61a

Signed-off-by: Ben Lewis <[email protected]>

SuperQ closed this Mar 5, 2023

benjimin mentioned this pull request Mar 7, 2023

Reduce cardinality of pg_stat_statements #765

Merged
